Design Engineering



Medium-speed multipliers trim cost, 
shrink bandwidth in speech transmission 



Dedicated multiplier ICs can help you steer clear 
of the high cost and wide bandwidth require- 
ments of digitally transmitting human voice 
signals. What's more, you won't even need the fastest 
multiplier chips — medium-speed devices give surpris- 
ingly high speed at much lower cost than their high- 
speed, parallel-type counterparts. 

Digital voice transmission is usually accomplished 
with pulse-code modulation techniques (PCM), by 
sampling an analog speech signal at an 8-kHz rate with 
8-bit resolution. But that requires a very broad 
bandwidth— 64 kbits/s. There is an alternative. With 
a speech compression system, you can knock band- 
width down to just 2.4 kbits/s. But to design such a 
system, you first must understand how to interface 
a dedicated multiplier to a microprocessor. 
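
The bandwidth figures above follow from simple arithmetic. The short sketch below only restates the numbers given in the text (8-kHz sampling, 8-bit resolution, a 2.4-kbit/s compressed rate); the variable names are illustrative and not part of any design.

```python
# Bit-rate arithmetic for PCM versus compressed speech, using the
# figures quoted in the text. Names are illustrative only.
SAMPLE_RATE_HZ = 8_000     # PCM sampling rate
BITS_PER_SAMPLE = 8        # PCM resolution
COMPRESSED_BPS = 2_400     # target rate of the speech compression system

pcm_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # 64,000 bits/s
ratio = pcm_bps / COMPRESSED_BPS             # roughly 27 to 1

print(pcm_bps, round(ratio, 1))
```

In other words, the compressed channel carries roughly one twenty-seventh of the PCM bit rate.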

Speech compression is one of the newest processes 
to take advantage of a medium-speed dedicated multi- 
plier. However, anytime you have a design problem 
that doesn't call for the under-200-ns speeds of parallel
multipliers, yet demands faster multiplication than 
you get from a traditional add-and-shift software 
algorithm, look to a medium-speed chip. 

You should already have a good idea of the performance
you can expect from your µP. Before you attempt to design
a µP-multiplier interface for a speech-compression system,
you should have an equally good idea of what a multiplier
can do.

Medium speed doesn't mean slow 

Medium-speed multipliers, such as the AMD
25LS2516 (Advanced Micro Devices) or the MMI 67508
(Monolithic Memories), multiply two 8-bit numbers
in a mere 400 ns, worst case. As you might expect,
a 16-bit multiplier like MMI's 67516 needs twice that
time, 800 ns, also worst case. If you must multiply
even faster, check out other devices and the special 
techniques explained in "Dedicated Multiplier ICs 
Speed-Up Processing In Fast Computer Systems," 
(ED No. 19, September 13, 1978, pp. 98-103). 



Shlomo Waser, Product Planning Manager, Monolithic
Memories, Inc., 1165 E. Arques Ave., Sunnyvale, CA 94086
and Dr. Allen Peterson, Professor of Electrical Engineering,
Stanford University, Stanford, CA 94305.



Most medium-speed multipliers operate in sequen- 
tial fashion, which means they generate partial prod- 
ucts step-by-step. By contrast, the faster parallel 
devices generate products in a single step. But high- 
speed multipliers, which receive operand inputs in 
parallel, are expensive and large (they're built in 40- 
pin packages). Medium-speed devices are housed in 
24-pin packages, and their cost is much easier to 
justify in designs that call for a dedicated multiplier. 
Another point to keep in mind is that you'll get double- 
duty out of many multiplier chips because they also 
perform division. 

In a 67516, two operands are loaded into registers 
in a time sequence. The device then jumps to either 
the multiply or divide routine to carry out the 
arithmetic process its instruction calls for. 

To understand the basic operation of this chip, look
at the block diagram in Fig. 1. The device contains
four working registers: the Y (multiplier), X
(multiplicand/divisor), W (least-significant half of the
double-length accumulator) and Z (most-significant
half of the double-length accumulator). The last two
registers are usually grouped together as the W-Z 
register, and operate as a working register for in- 
termediate results. In addition, the W-Z register stores 
the final double-length product in multiplication or 
the quotient in division. 

Final products or quotients are placed on the output 
bus in a time sequence, after the expiration of the 
number of clock cycles required to complete the 
operation. Multiplication requires eight clock-cycles, 
while division needs 20 — both measured from the time 
operands are first loaded into the X and Y registers. 

Three instruction lines, I0, I1 and I2, select the
function that the 67516 performs. You have a choice
of sixteen multiply and seven divide options. For your
option to be exercised, instructions must come from
the microprocessor, so you'll have to know how µP
instructions command a multiplier.

Make it easy on the µP

One of the primary benefits gained from using a
dedicated multiplier is a savings of processor time.
Your system will be capable of multiplying with a
minimum of µP instructions, since a dedicated multiplier
needs far fewer instructions than if multiplication
were to be performed in the processor's
arithmetic logic unit (ALU).

[Fig. 1 block diagram: Y register, X register, Z register, W register, shift mux, instruction sequencer and controls, 16-bit high-speed ALU]

1. Four working registers and a 16-bit high-speed ALU are
the heart of Monolithic Memories' 67516 16-bit multiplier
IC. The sequencer coordinates the three-line instruction
code, which determines how the ALU operates on data.

Here's how a 67516 multiplier operates under processor
command to perform the familiar sum-of-products
operation represented by the expression,

$$\sum_{i=1}^{n} X_i Y_i$$



Operation        Time slot (cycle):  1    2    3-10        11    12    13-20

 Σ X_i Y_i       INS code            6    0    (multiply)  6     2     (multiply)
                 Bus                 X    Y                X_i   Y_i

-Σ X_i Y_i       INS code            6    1    (multiply)  6     3     (multiply)
                 Bus                 X    Y                X_i   Y_i

2. Multiplication codes and computation times show a
sum-of-products operation in a 67516 chip. The second
operation is the same as the first, only with a negative
multiplication operation.

[Fig. 3: divide timing table. Operation: Z,W/X. Rows: time slot, INS code, bus (X, Z, W).]

3. Codes and computation times for a 67516's division
operation require more time than multiplication. Division
takes twenty time-cycles, multiplication takes eight.
Division also requires an additional instruction code.

You'll need instruction codes sent from the µP to
the multiplier (see Fig. 2 for a general scheme). An
instruction code of 5 or 6 results in loading the first
operand into the X register, depending on whether the
operand is an integer or fractional. An instruction code
of 0 follows, which tells the 67516 to load the next
operand on the bus into the Y register. For the next
eight clock cycles, the two operands are multiplied
together, with the product entering the W-Z register.
Note that your microprocessor can attend to other
business during this time.

At the conclusion of the first multiplication (end
of the tenth clock cycle), two more instructions must
be issued to the multiplier. An instruction code of 6
loads the next operand into the X register, and an
instruction code of 2 tells the device to multiply and
add the result to the contents of the W-Z register.
Again, eight clock cycles must elapse before this result
appears in the W-Z register.

Naturally, you can continue to issue 6 and 2 instructions
until the entire multiplication-summing process
is complete. Then the result in the accumulator gets
placed on the data bus.
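
The instruction sequence just described can be sketched as a small behavioral model. This is only an illustration of the code order (6, 0, then 6, 2 repeated, then 7) taken from the text; the class and function names are hypothetical, and the model ignores clock cycles, signed and fractional modes, and the two-step readout of the W-Z register.

```python
class MultiplierModel:
    """Behavioral sketch of the instruction codes named in the text.

    Assumptions: timing is ignored, operands are plain integers, and a
    single READ returns the whole W-Z contents at once.
    """
    LOAD_X = 6         # load the X register from the bus
    MULTIPLY = 0       # load Y from the bus, then W-Z = X * Y
    MULTIPLY_ADD = 2   # load Y from the bus, then W-Z = W-Z + X * Y
    READ = 7           # place the W-Z contents on the bus

    def __init__(self):
        self.x = 0
        self.wz = 0

    def issue(self, code, bus=0):
        if code == self.LOAD_X:
            self.x = bus
        elif code == self.MULTIPLY:
            self.wz = self.x * bus
        elif code == self.MULTIPLY_ADD:
            self.wz += self.x * bus
        elif code == self.READ:
            return self.wz
        else:
            raise ValueError(f"code {code} is not modeled in this sketch")
        return None


def sum_of_products(xs, ys):
    """Issue 6,0 for the first pair, 6,2 for the rest, then read with 7."""
    chip = MultiplierModel()
    for n, (x, y) in enumerate(zip(xs, ys)):
        chip.issue(chip.LOAD_X, x)
        chip.issue(chip.MULTIPLY if n == 0 else chip.MULTIPLY_ADD, y)
    return chip.issue(chip.READ)


total = sum_of_products([1, 2, 3], [4, 5, 6])
print(total)  # 1*4 + 2*5 + 3*6 = 32
```

Each operand pair costs the processor only two bus writes; the multiply itself runs inside the chip while the processor is free.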

Division and multiplication are similar; division
just takes longer. The instructions and timing requirements
for dividing a double-length dividend by a
single-length divisor are shown in Fig. 3. For division,
the processor must issue three instructions instead of
two, and the divide cycle takes 20 time-slots rather
than eight. One code is common to both divide and
multiply operations: a 7 instruction-code reads the
contents of the W-Z register to the data bus.

4. Interfacing a dedicated multiplier to a 16-bit microprocessor
is a simple process. The least-significant three
bits of the CPU instruction word form the multiplier's
instruction code. The remaining address bits determine
whether the multiplier is selected.

That takes care of the basic instructions needed to
operate a multiplier. The next phase is interfacing:
how the instructions get from µP to multiplier.

Interfacing the multiplier 

A 67516 has only three instruction lines, so it's quite
simple to connect to any 16-bit microprocessor (see
Fig. 4). The technique is to assign the multiplier's three
instruction lines to the three least-significant bits on
the address bus, and route the bits to the instruction
input. The remaining bits of the address bus are routed
through a programmable array logic (PAL) device that
acts as a decoder. Decoding these bits determines
whether the 67516 is selected.

With this interface, if the multiplier is assigned to
address location 100, any address in the 100 to 107
range selects the multiplier by enabling the chip. What
actually happens is that the LOAD line goes low. The
three least-significant bits then represent the instruction
to be carried out. For example, if the CPU sends
out address 106, the multiplier carries out the
6-instruction code, which is LOAD. Similarly, if
address 100 is sent out, the multiplier recognizes the
0-instruction, or multiply. If you want further details
of this technique, called memory-mapped programming,
refer to the box, "Programming A Multiplier
From A µP."
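
The address split described above can be sketched in a few lines. This is a software illustration of what the decoder does, not the PAL equations themselves; the function name is hypothetical, and the article's address "100" is assumed here to be hexadecimal so that the three least-significant bits of 100 through 107 run from 0 to 7.

```python
# Assumption: the article's address "100" is read as hexadecimal 0x100.
MULTIPLIER_BASE = 0x100


def decode(address, base=MULTIPLIER_BASE):
    """Split an address into (chip selected?, 3-bit instruction code)."""
    selected = (address & ~0x7) == base   # upper bits pick the chip
    code = address & 0x7                  # lower three bits are the instruction
    return selected, code


print(decode(0x106))  # (True, 6): load X
print(decode(0x100))  # (True, 0): load Y and multiply
print(decode(0x108))  # (False, 0): outside the multiplier's block
```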

At this point, you may be considering the tradeoff
between programming a multiplier or allowing your
system to multiply under the control of a processor's
built-in multiply macroinstruction. Many of the new
16-bit, and some of the 8-bit, µPs have this feature.
Don't consider too long. If you're after speed, a 67516
can multiply twenty times faster than the newest
16-bit processor using a macroinstruction.

Multiplier interfacing can also be accomplished with 
a bit-slice microprocessor like AMD's 2901A (see Fig. 
5). This system runs on a fast 100-ns clock, so you 
can use a two-stage pipeline technique, in which the 
first stage is an address register and the second is 
a PROM output register. 

In bit-slice operation, a microinstruction located in
the PROM tells the 2901A and the 67516 what operation
to perform. Pipelining allows input data to be
queued so the 67516's 800-ns multiply time can be
worked in with the 100-ns system clock time. But this
design only works well where relatively few branching
decisions are made.

With the link between the microprocessor and
multiplier established, the more difficult problem
comes to the forefront: compressing speech signals
so they can be digitally transmitted.

5. Another way of interfacing a multiplier is with a bit-slice
microprocessor. This pipeline configuration has both
processor (a 4-bit slice) and multiplier interfaced directly
to the 16-bit bus.

[Fig. 6 block diagram: analysis and synthesis subsystems joined by a transmission line]

6. A speech compression system breaks down into two
subsystems: analysis and synthesis.

[Fig. 7 block diagram: white noise generator, pitch impulse generator, voiced/unvoiced switch, gain, vocal-tract model]

7. Speech synthesis, shown in this digital model of a
narrowband speech synthesizer, has three varying parameters:
filter coefficients ($a_k$), gain and pitch period.

Fundamentals of speech 

Speech signals are produced by relatively slow 
movements of human vocal cords. This allows them 
to be described by parameters having a much lower 
information rate than the 64-kbit/s of PCM. Although 
speech compression circuits operate at a slow 2.4- 
kbits/s, you'll still have to do a lot of number 
crunching to compute the necessary parameters in 
real time. Once again, speed becomes essential — that 
means you'll need a dedicated multiplier to assist the 
microprocessor. 

The algorithms for speech compression are fairly 
complex, so it's best to break the subject into two parts 
—speech synthesis and speech analysis (see Fig. 6). 
Synthesis is the easier of the two to understand. 

Human speech is characterized as either voiced or 
unvoiced. Vocal cords vibrate to create voiced sounds 
like l, m or ee, and the rate at which the vocal cords
open and close determines the pitch of the generated 
sounds. Unvoiced sounds like s, f and sh have no 
definite pitch and are generated by noise excitation 
of the vocal tract, which is produced by air flow out 
of the lungs. 

Speech synthesis takes advantage of the fact that 
the speech waveform is almost constant within a time 
frame of 22.5 ms. Over this short time interval, a 
speech synthesizer is simply a digital filter, also called
a vocal tract filter. And the filter can be driven by
one of two sources. 

A voiced sound produces drive from a constant
frequency (pitch) pulse. Unvoiced sound represents
drive from a white noise generator. A speaker driven
by a d/a converter generates the actual speech (see
Fig. 6); the converter is updated every 125 µs, an
8-kHz rate. Acceptable speech reproduction requires at
least a 4-kHz bandwidth.

Programming a multiplier from a µP

With the µP-multiplier interface technique illustrated
in Fig. 4, you can write a program that not
only allows the µP to select the multiplier but tells
it what operation to perform.

If you've assigned the dedicated multiplier (a 67516)
to address 100, then any address code from 100
through 107 selects the multiplier. Let's say you have
stored the two numbers to be multiplied in address
locations 108 and 109, and you want the double-length
result placed in addresses 110 and 111. Finally, assume
that the µP move instruction takes the general form,

MOV SA, DA,

where SA is the source address and DA is the destination
address. The following segment of assembly language
code will cause the multiplier to perform the multiply
operation:

Assembly Program     67516 Instruction

MOV 108,106          Load X (instruction 6)
MOV 109,100          Load Y (instruction 0)
NOP                  Multiply
MOV 107,110          Read 16 MSB of product (instruction 7)
MOV 107,111          Read 16 LSB of product (instruction 7)

When you write your program, you may not need
the NOP instruction, depending on the speed of your
processor. Without a NOP, some processors have more
than an 800-ns delay between the write pulse of the
second instruction and the read pulse of the fourth
instruction. If your µP is this slow, omit the NOP. But
if you use a very fast processor, you may need one
or more NOPs to ensure that the 67516's 800-ns throughput
time is met.
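
The NOP rule in the box reduces to a small calculation. The sketch below is an illustration only: the function name and the processor timings in the examples are hypothetical, and only the 800-ns multiply time comes from the text.

```python
import math


def nops_needed(write_to_read_ns, nop_ns, multiply_ns=800):
    """Number of NOPs to insert so the read follows the write by >= multiply_ns.

    write_to_read_ns: delay the processor already provides between the
                      write of the second MOV and the read of the next MOV
                      (hypothetical, processor dependent).
    nop_ns:           execution time of one NOP (hypothetical).
    """
    shortfall = multiply_ns - write_to_read_ns
    if shortfall <= 0:
        return 0                      # slow processor: no NOP required
    return math.ceil(shortfall / nop_ns)


print(nops_needed(1000, 200))  # 0, the processor is already slow enough
print(nops_needed(300, 200))   # 3, a fast processor must wait 500 ns more
```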

Speech synthesis is illustrated by the narrow-band
speech synthesizer in Fig. 7. From this, a vocal tract
filter model with the following transfer function
emerges:

$$H(z) = G \left[ \frac{1}{1 - \sum_{k=1}^{10} a_k z^{-k}} \right] = \frac{E_o}{E_i} \tag{1}$$

where G is the gain (amplitude),
$a_k$ is one of 10 coefficients of the filter,
$z^{-k}$ is one of ten past values of the output.

The filter's transfer function allows you to find the
filter's response,

$$E_o = G E_i + a_1 E_o z^{-1} + a_2 E_o z^{-2} + \dots + a_{10} E_o z^{-10} \tag{2}$$

Here, $E_o z^{-1}$ is the previous value of the output and
$E_i$ is the filter input at the pitch frequency. The filter
output is a linear combination of, that is, linearly
predictable from, its ten previous values. This speech
processing scheme is called LPC-10, or Linear Predictive Code
using 10 coefficients.

[Fig. 8 plots: top, a stored sinusoid; middle, the same sinusoid delayed by one time unit; bottom, the ACF R(n) versus lag n (0 to 38), with R(0), R(1) and the pitch period marked]

8. In an autocorrelation function (ACF), the top sinusoid
is multiplied by the middle one. The middle waveform is
identical to the top, but delayed by one time unit. This
product gives the R(1) coefficient shown in the bottom
waveform. The ACF is composed of all coefficients, R(0)
through R(n). The pitch period is the delay measured from
the start of the ACF to the first peak.

The implication of Eq. 2 is that to synthesize speech,
you only need eleven multiply and accumulate operations
every 125 µs. Therefore, each multiply and
accumulate can be as slow as about 11.4 µs (125/11), and
yield speech of acceptable quality. For this reason,
speech synthesizers can be implemented with one of
the older, slower semiconductor processes, PMOS
(see the article "Here's a Breakthrough — A Low-Cost 



Speech Synthesizer On A Chip," ED No. 15, July 19, 
1978, p. 32). 
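
The eleven multiply-and-accumulate operations per output sample can be seen in a direct rendering of Eq. 2. This is a minimal sketch in floating point; the function name and the test coefficients are illustrative, and a real synthesizer would use fixed-point arithmetic and update the parameters every frame.

```python
def synthesize(excitation, coeffs, gain):
    """All-pole filter of Eq. 2: out = G*in + sum of a_k * (k-th past output).

    excitation: input samples (pitch pulses or noise), one per 125 us
    coeffs:     the filter coefficients a_1 ... a_10
    gain:       the gain factor G
    """
    history = [0.0] * len(coeffs)      # history[0] is the previous output
    output = []
    for e in excitation:
        acc = gain * e                 # one multiply for the gain term
        for a_k, past in zip(coeffs, history):
            acc += a_k * past          # ten multiply-accumulates
        output.append(acc)
        history = [acc] + history[:-1] # shift the delay line
    return output


# Illustrative check: a single nonzero coefficient gives a decaying response.
demo = synthesize([1.0, 0.0, 0.0, 0.0], [0.5] + [0.0] * 9, gain=1.0)
print(demo)  # [1.0, 0.5, 0.25, 0.125]
```

With ten coefficients, the inner loop plus the gain term add up to the eleven operations counted in the text.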

Although the speech waveform remains constant in 
a 22.5 ms time frame, within each time frame you 
must update the following parameters: 

1. The pitch period (for a voiced sound). 

2. The ten coefficients, a_1 through a_10.

3. The gain factor, G. 

Some speech synthesizers store these parameters 
in a large ROM (256 kbits). At 2400 bits/s, that's 
equivalent to a speech duration of more than 100 s. 
But if you want to do narrowband speech trans- 
mission, you have to compute parameters on the fly, 
in real time. And that puts you into the province of 
speech analysis. 

Speech analysis: delaying tactics

To determine the pitch period and the filter coeffi- 
cients requires some higher mathematics — the auto- 
correlation function (ACF). The ACF coefficients are 
used to compute the pitch period and the ten coefficients.
It turns out that the nth coefficient of the ACF
results from multiplying the speech waveform of one
time frame by the same waveform delayed by n
samples (see Fig. 8).

At the top is a stored waveform (a sinusoid), and
below it is the same sinusoid delayed by one time unit.
The bottom waveform represents the collection of all
ACF coefficients. Coefficient R(0) in the bottom
waveform results from the top sinusoid being multiplied
by itself. And R(1) results from the top signal
being multiplied by the second waveform (1 unit
delayed). Each new coefficient, R(n), is formed as the
top waveform shifts n units to the right. As this
occurs, n zeros are added on the left. The declining
peaks of the ACF are caused by the shifting-in of zeros
from the left.

In this application, a single time frame contains 180
samples. You arrive at this number by dividing the
22.5 ms time frame by the 125 µs sampling period.
In practice, it's usually not necessary to compute the
entire ACF of a sample block (22.5 ms), since pitch
periods normally fall in the range of 3 to 12 ms (pitch
frequencies of 80 to 300 Hz). Specifically, it's sufficient
to compute only the first 100 R's (called lags in speech
jargon) of the ACF.

To understand this better, examine the ACF for 180
samples with 100 lags. For n = 0 to 99 or 100 lags,

$$R(n) = \frac{1}{180} \sum_{j=0}^{180} S_j S_{j-n}, \qquad S_{j-n} = 0 \text{ for } j < n \tag{3}$$

Here, $S_j$ is the jth sample of a speech waveform and
R(n) is the nth ACF coefficient. To see how long it takes
to compute 100 lags, Eq. 3 must be expanded.

$$R(0) = \tfrac{1}{180}\,(S_0 S_0 + S_1 S_1 + S_2 S_2 + \dots + S_{180} S_{180})$$
$$R(1) = \tfrac{1}{180}\,(S_0 \cdot 0 + S_1 S_0 + S_2 S_1 + \dots + S_{180} S_{179})$$
$$\vdots$$
$$R(99) = \tfrac{1}{180}\,(S_0 \cdot 0 + S_1 \cdot 0 + \dots + S_{180} S_{81})$$

The first thing to notice about this expansion is that
R(0) requires 180 multiply and accumulate operations,
but R(99) needs only 80 such operations because of
the shifted-in zeros. The average number of operations
is 130 or (180 + 80)/2. Multiply the average number
of operations by 100 lags and you get a total of 13,000
operations. If each multiply and accumulate can be
accomplished in 1 µs, it will take only 13 ms to compute
all 100 lags. That's well within the 22.5 ms
time frame, so it appears that speech signals can be
processed in real time.
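
The operation count can be checked with a direct implementation of the ACF. This sketch assumes the 180 samples of a frame are indexed 0 through 179 and skips the products with shifted-in zeros; the function name and the test sinusoid are illustrative. Under that indexing the exact count comes to 13,050, in line with the 13,000 estimated above.

```python
import math


def autocorrelation(samples, lags=100):
    """Return (list of R(0)..R(lags-1), number of multiply-accumulates)."""
    n_samples = len(samples)
    r = []
    mac_count = 0
    for n in range(lags):
        acc = 0.0
        # Products with shifted-in zeros (j < n) are skipped entirely.
        for j in range(n, n_samples):
            acc += samples[j] * samples[j - n]
            mac_count += 1
        r.append(acc / n_samples)
    return r, mac_count


# Illustrative frame: a sinusoid with a period of 40 samples (5 ms).
frame = [math.sin(2 * math.pi * j / 40) for j in range(180)]
r, macs = autocorrelation(frame)
print(macs)                 # 13050
print(macs * 1e-6 * 1e3)    # about 13 ms at 1 us per operation
```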

In the days before dedicated multipliers, an ACF 
couldn't be computed this accurately. In fact, multi- 
plication had to be avoided, so the best ACF was only 
an approximation produced by other methods. 

An accurate ACF is a very necessary tool. You can 
use it to find the pitch period, coefficients and gain 
of a speech waveform. 

Spinoffs from the ACF 

The pitch period of a speech waveform is defined 
as the delay of the secondary peak relative to the origin 
of the ACF (shown in Fig. 8). For example, the absence 
of marked secondary peaks in Fig. 8 tells you that
this speech segment is unvoiced. Finding the secon- 
dary peak is another matter entirely. 

You find a secondary peak by repetitive subtrac- 
tions of all ACF coefficients — storing the largest value 
while scanning all 100 lags. These 100 subtractions 
are performed by a dedicated multiplier like the 67516. 
It multiplies each operand by 1, then subtracts the 
value from the accumulator. Each subtraction takes 
about 1 µs, which may seem slow, but all 100 subtractions
take only 0.1 ms, insignificant within a 22.5 ms
time frame. Even better, a dedicated multiplier per- 
forms the entire operation without any additional 
support from adders or the ALU. 
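
The peak search can be sketched as a scan in which every comparison is a subtraction. This is an illustration under stated assumptions, not the article's exact routine: the function name is hypothetical, and the scan starts at lag 24 (3 ms at 125 µs per sample, the shortest pitch period quoted earlier) so that the main lobe around R(0) is not mistaken for the secondary peak.

```python
import math


def pitch_lag(r, min_lag=24):
    """Return the lag of the largest ACF value at or beyond min_lag."""
    best_lag = min_lag
    best = r[min_lag]
    for lag in range(min_lag + 1, len(r)):
        if r[lag] - best > 0:          # the comparison is one subtraction
            best, best_lag = r[lag], lag
    return best_lag


# Illustrative ACF: a 40-sample periodicity with linearly declining peaks.
demo_r = [math.cos(2 * math.pi * n / 40) * (1 - n / 180) for n in range(100)]
lag = pitch_lag(demo_r)
print(lag, lag * 0.125)  # lag 40, i.e. a 5-ms pitch period
```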

The computation of filter coefficients isn't as simple
as finding the pitch period. A set of linear equations
specifies the ten filter coefficients.
For i = 1 to 10,

$$\sum_{k=1}^{10} a_k R(|i - k|) = R(i) \tag{4}$$

Expanding this summation, you get,

$$a_1 R(0) + a_2 R(1) + a_3 R(2) + \dots + a_{10} R(9) = R(1)$$
$$a_1 R(1) + a_2 R(0) + a_3 R(1) + \dots + a_{10} R(8) = R(2)$$
$$\vdots$$
$$a_1 R(9) + a_2 R(8) + a_3 R(7) + \dots + a_{10} R(0) = R(10),$$

where the $a_k$'s are the 10 unknowns in this set of
equations. It's possible to solve these equations in a
few seconds with a general purpose computer. But
who's got a few seconds? You have to solve this
problem in a few milliseconds at most. The solution
is an iterative method whose complete details can be
found in the box "The Levinson/Durbin Recursion."
With the aid of this algorithm, it's possible to compute
all coefficients in the remarkable time of 0.35 ms.

10. A bit-slice implementation of the blocks in Fig. 9
requires the 67516 multiplier chip's full capability. Most
of the calculations necessary to perform the speech
synthesis function are done in the multiplier.

Finally, you must compute the gain factor, G. This
is a fairly simple problem that takes just 20 µs to
perform with the following equation:

$$G^2 = R(0) - \sum_{k=1}^{10} a_k R(k).$$

At this point, only one thing really matters. With 
all the required computations, will the speech com- 
pression system operate in real time? 

Total the time 

The answer is found by adding together all the 
elements that make up the computation time for a 
single frame. If the total time used comes in under 
the maximum of 22.5 ms, the system will operate in 
real time. 

For speech synthesis:        1.98 ms (180 outputs X 11 µs)
For speech analysis:
    autocorrelation         13.0  ms
    pitch period             0.1  ms
    filter coefficients      0.35 ms
    gain                     0.02 ms

Grand total:                15.45 ms
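
The budget above can be re-added in integer microseconds to avoid rounding. The sketch below only restates the figures from the table; the dictionary keys and variable names are illustrative.

```python
# Per-frame time budget from the table, in microseconds.
FRAME_US = 22_500

budget_us = {
    "synthesis": 180 * 11,        # 180 outputs at 11 us each
    "autocorrelation": 13_000,
    "pitch period": 100,
    "filter coefficients": 350,
    "gain": 20,
}

total_us = sum(budget_us.values())
margin_us = FRAME_US - total_us

print(total_us / 1000, margin_us / 1000)  # 15.45 ms used, 7.05 ms to spare
```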

Only 15.45 ms of the 22.5-ms time frame has been
used for all the computations. You've got a comfortable
margin of 7 ms left over for serializing,
encoding and other operations. All that remains of this
analysis is to encode each 22.5-ms speech segment into
the number of bits necessary to operate the system
at 2400 bits/s.

The Levinson/Durbin recursion

A major problem in speech analysis/synthesis is the
determination of the values of ten filter coefficients
in real time. These coefficients are related to the
autocorrelation function's coefficients by the ten linear
equations shown in the text as Eq. 4.

One of the advantages of using the linear predictive
method to design speech compression systems is the
availability of a very efficient iterative algorithm for
finding the ten coefficients: the Levinson/Durbin
recursion. To solve the ten equations in ten unknowns,
start with the Levinson/Durbin definition,

$$k_i = \frac{\sum_{j=1}^{i-1} a_j(i-1)\,R(i-j) - R(i)}{E_{i-1}}$$
$$a_i(i) = -k_i$$
$$a_j(i) = a_j(i-1) + k_i\,a_{i-j}(i-1)$$
$$E_i = E_{i-1}\,(1 - k_i^2).$$

This set of equations is solved recursively for i = 1,
2, . . . 10, with the final solution given by
$a_j = a_j(10)$, where j = 1, 2, . . . 10.
The recursion is started by $E_0 = R(0)$, and from this
you find $k_1$:

$$k_1 = -\frac{R(1)}{E_0} = -\frac{R(1)}{R(0)}$$

Then $k_1$ and $E_0$ are used to compute $E_1$ and $a_1(1)$:

$$a_1(1) = -k_1$$
$$E_1 = E_0\,(1 - k_1^2)$$

The iteration for $k_2$ is similar to that for $k_1$:

$$k_2 = \frac{a_1(1)\,R(1) - R(2)}{E_1}$$
$$E_2 = E_1\,(1 - k_2^2)$$
$$a_1(2) = a_1(1) + k_2\,a_1(1).$$

Continue the iteration process until you get a value
for $k_{10}$:

$$k_{10} = \frac{a_1(9)R(9) + a_2(9)R(8) + \dots + a_9(9)R(1) - R(10)}{E_9}$$

Note that $k_{10}$ is computed by ten multiply-and-accumulate
operations and one division. To compute
$E_{10}$, you'll have to perform three multiplications and
one addition operation. And $a_1(10)$ through $a_9(10)$ take
nine multiplications and nine additions. It turns out
that the tenth iteration can be computed in just 35
µs. For simplicity, assume that every iteration takes
35 µs. That's not strictly true, since it doesn't take
as much time to do the first ($k_1$) as the last ($k_{10}$).
However, by that standard, it would take only 350
µs to do all ten coefficients. So the actual time is really
less than 350 µs. Now the ten filter coefficients are,

$$a_1 = a_1(10) = a_1(9) + k_{10}\,a_9(9)$$
$$a_2 = a_2(10) = a_2(9) + k_{10}\,a_8(9)$$
$$\vdots$$
$$a_9 = a_9(10) = a_9(9) + k_{10}\,a_1(9)$$
$$a_{10} = a_{10}(10) = -k_{10}.$$
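
The Levinson/Durbin recursion in the box can be sketched in a few lines. This is a floating-point illustration that follows the box's sign convention (a_i(i) = -k_i); the function names and the test autocorrelation values are assumptions made for the example, and a hardware version would run in fixed point on the multiplier. The gain equation from the text is included so the two results can be compared.

```python
def levinson_durbin(R, order=10):
    """Solve sum_k a_k R(|i-k|) = R(i), i = 1..order, by recursion.

    R must hold R(0) .. R(order). Returns ([a_1 .. a_order], E_order).
    """
    a = [0.0] * (order + 1)            # a[j] holds a_j(i); a[0] is unused
    E = R[0]                           # E_0 = R(0)
    for i in range(1, order + 1):
        acc = sum(a[j] * R[i - j] for j in range(1, i))
        k = (acc - R[i]) / E           # k_i, with one division per iteration
        prev = a[:]                    # coefficients a_j(i-1)
        a[i] = -k                      # a_i(i) = -k_i
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        E *= 1.0 - k * k               # E_i = E_{i-1} (1 - k_i^2)
    return a[1:], E


def gain_squared(R, a):
    """G^2 = R(0) - sum_k a_k R(k), as in the text."""
    return R[0] - sum(a_k * R[k] for k, a_k in enumerate(a, start=1))


# Illustrative autocorrelation R(n) = 0.5**n, whose solution is a_1 = 0.5.
R_demo = [0.5 ** n for n in range(11)]
a_demo, E_demo = levinson_durbin(R_demo)
print(a_demo[0], E_demo, gain_squared(R_demo, a_demo))  # 0.5 0.75 0.75
```

In this example the final error term of the recursion equals the G² given by the gain equation, so the gain can fall out of the same computation.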

Within 1 s, there are about 44.4 time frames, each 22.5
ms long. At a rate of 2400 bits/s, each time frame
holds 54 bits (2400/44.4), so the filter coefficients, gain,
pitch and any other data must be encoded into 54 bits
of information. Here's how the 54 bits are apportioned:



Filter coefficients a_1 through a_4     5 bits each
Filter coefficients a_5 through a_8     4 bits each
Filter coefficient a_9                  3 bits
Filter coefficient a_10                 2 bits
Gain                                    5 bits
Pitch and other data                    8 bits

Total                                   54 bits
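
The bit budget can be checked with integer arithmetic. The sketch below assumes the second row of the table covers a_5 through a_8, which is the reading that makes the rows add up to the stated total; the names are illustrative.

```python
# Bits available per frame at 2400 bits/s with 22.5-ms frames.
BIT_RATE_BPS = 2_400
FRAME_US = 22_500
bits_per_frame = BIT_RATE_BPS * FRAME_US // 1_000_000

# Allocation from the table (a_5..a_8 for the second row is an assumption).
allocation = {
    "a_1..a_4": 4 * 5,
    "a_5..a_8": 4 * 4,
    "a_9": 3,
    "a_10": 2,
    "gain": 5,
    "pitch and other data": 8,
}

allocated = sum(allocation.values())
print(bits_per_frame, allocated)  # 54 54
```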



Now you've completed the examination of a speech 
compression system that transmits voice signals at 
2400 bits/s rather than the 64 kbits/s previously 
required. What you need to know is what a practical 
speech compression system looks like. 

Speech processing architecture 

To send voice signals over transmission lines, you 
need a speech analysis circuit at the transmitting 
station and a speech synthesizer at the receiving end. 
(See Fig. 9 for a more detailed speech compression 
system than shown in Fig. 6.) 

Both the analyzer and synthesizer circuits can be 
implemented in the same way— with a micro- 
processor, dedicated multiplier, and memory and 
microprogramming chips (as shown in the system 
block diagram of Fig. 10). The processor operates via 
a single data bus to take advantage of the multiplier's 
single-bus structure. And, as you can see, memory 
chips make up a large portion of the system. 

Main memory, 2048 words x 16 bits, is a TMS 4048 
chip (Texas Instruments). Supporting the main memo- 
ry is a 16-word X 16-bit high-speed buffer memory 
(74LS219, also TI). And the microprogram memory, 
which accommodates up to 4096 words, is composed 
of eleven chips, each a 1-k X 4 PROM (Monolithic
Memories 63RS441). The processor itself, a 2910
microprogram controller (AMD), handles the 4096 words of
the PROM. If necessary, you can expand the micro- 
program memory to include more bits/word, or to 
increase the number of control-instruction words. The 
system, as shown, operates in a 100-ns microcycle 
time. 

This speech compressor is an example of what
designing with microprocessors and multiplier devices
can accomplish: a system design that employs
a total IC count of under thirty chips. Previously, with
only MSI devices available, you would have needed
about 200 chips to do the same thing.



Reprinted from Electronic Design-February 1, 1979 



Copyright 1979 Hayden Publishing Co., Inc. 



